Simultaneously Satisfying Linear Equations Over $\mathbb{F}_2$:
MaxLin2 and Max-r-Lin2 Parameterized Above Average



Robert Crowston^1, Michael Fellows^2, Gregory Gutin^1, Mark Jones^1, Frances
Rosamond^2, Stéphan Thomassé^3, and Anders Yeo^1

^1 Royal Holloway, University of London
Egham, Surrey TW20 0EX, UK
^2 Charles Darwin University
Darwin, Northern Territory 0909 Australia
^3 LIRMM-Université Montpellier II
34392 Montpellier Cedex, France



Abstract. In the parameterized problem MaxLin2-AA[k], we are given a system
with variables $x_1, \ldots, x_n$ consisting of equations of the form
$\prod_{i \in I} x_i = b$, where $x_i, b \in \{-1, 1\}$ and $I \subseteq [n]$, each
equation has a positive integral weight, and we are to decide whether it is
possible to simultaneously satisfy equations of total weight at least $W/2 + k$,
where $W$ is the total weight of all equations and $k$ is the parameter (if
$k = 0$, the possibility is assured). We show that MaxLin2-AA[k] has a kernel
with at most $O(k^2 \log k)$ variables and can be solved in time
$2^{O(k \log k)} (nm)^{O(1)}$. This solves an open problem of Mahajan et al. (2006).

The problem Max-r-Lin2-AA[k, r] is the same as MaxLin2-AA[k] with two
differences: each equation has at most $r$ variables and $r$ is the second
parameter. We prove a theorem on Max-r-Lin2-AA[k, r] which implies that
Max-r-Lin2-AA[k, r] has a kernel with at most $(2k - 1)r$ variables, improving a
number of results including one by Kim and Williams (2010). The theorem also
implies a lower bound on the maximum of a function
$f : \{-1, 1\}^n \to \mathbb{R}$ of degree $r$. We show applicability of the lower
bound by giving a new proof of the Edwards-Erdős bound (each connected graph on
$n$ vertices and $m$ edges has a bipartite subgraph with at least
$m/2 + (n - 1)/4$ edges) and obtaining a generalization.



1 Introduction 



1.1 MaxLin2-AA and Max-r-Lin2-AA. While MaxSat and its special case Max-r-Sat
have been widely studied in the literature on algorithms and complexity for many
years, MaxLin2 and its special case Max-r-Lin2 are less known, but Håstad [22]
succinctly summarized the importance of these two problems by saying that they are
"as basic as satisfiability." These problems provide important tools for the study of
constraint satisfaction problems such as MaxSat and Max-r-Sat since constraint
satisfaction problems can often be reduced to MaxLin2 or Max-r-Lin2, see, e.g.,
[1, 2, 10, 11, 22, 24]. As a result, in the last decade, MaxLin2 and Max-r-Lin2 have
attracted significant attention in algorithmics.



In the problem MaxLin2, we are given a system $S$ consisting of $m$ equations in
variables $x_1, \ldots, x_n$, where each equation is $\prod_{i \in I_j} x_i = b_j$ and
$x_i, b_j \in \{-1, 1\}$, $j = 1, \ldots, m$. Equation $j$ is assigned a positive
integral weight $w_j$ and we wish to find an assignment of values to the variables
in order to maximize the total weight of the satisfied equations.

Let $W$ be the sum of the weights of all equations in $S$ and let $\mathrm{sat}(S)$
be the maximum total weight of equations that can be satisfied simultaneously. To
see that $W/2$ is a tight lower bound on $\mathrm{sat}(S)$, choose assignments to
the variables independently and uniformly at random. Then $W/2$ is the expected
weight of satisfied equations (as the probability of each equation being satisfied
is $1/2$) and thus $W/2$ is a lower bound; to see the tightness consider a system
consisting of pairs of equations of the form $\prod_{i \in I} x_i = -1$,
$\prod_{i \in I} x_i = 1$ of the same weight, for some non-empty sets
$I \subseteq [n]$. This leads to the following decision problem:

MaxLin2-AA 

Instance: A system $S$ of equations $\prod_{i \in I_j} x_i = b_j$, where
$x_i, b_j \in \{-1, 1\}$, $j = 1, \ldots, m$; equation $j$ is assigned a positive
integral weight $w_j$, and a nonnegative integer $k$.
Question: $\mathrm{sat}(S) \ge W/2 + k$?
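The definition can be made concrete with a small brute-force sketch. It is only an illustration for tiny systems, not an algorithm from the paper; the representation of an equation as a triple `(I, b, w)` with 0-based variable indices and the function name `sat` are assumptions made here.

```python
from itertools import product

def sat(equations, n):
    """Brute-force sat(S): maximum total weight of simultaneously satisfied equations.

    equations: list of (I, b, w), where I is a tuple of 0-based variable indices,
    b is -1 or 1, and w is a positive integer weight (hypothetical representation).
    """
    best = 0
    for x in product((-1, 1), repeat=n):
        total = 0
        for I, b, w in equations:
            value = 1
            for i in I:
                value *= x[i]
            if value == b:
                total += w
        best = max(best, total)
    return best

# Tight example: contradictory pairs of equal weight give sat(S) = W/2 exactly.
tight = [((0, 1), 1, 3), ((0, 1), -1, 3), ((2,), 1, 2), ((2,), -1, 2)]
W = sum(w for _, _, w in tight)
print(sat(tight, 3), W / 2)
```

An instance with parameter k is then a Yes-instance exactly when `sat(equations, n) >= W / 2 + k`.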

The maximization version of MaxLin2-AA (maximize $k$ for which the answer is
Yes) has been studied in the literature on approximation algorithms, cf. [22, 23]. These
two papers also studied the following important special case of MaxLin2-AA:

Max-r-Lin2-AA

Instance: A system $S$ of equations $\prod_{i \in I_j} x_i = b_j$, where
$x_i, b_j \in \{-1, 1\}$, $|I_j| \le r$, $j = 1, \ldots, m$; equation $j$ is assigned
a positive integral weight $w_j$, and a nonnegative integer $k$.
Question: $\mathrm{sat}(S) \ge W/2 + k$?

Håstad [22] proved that, as a maximization problem, Max-r-Lin2-AA with any
fixed $r \ge 3$ (and hence MaxLin2-AA) cannot be approximated within $c$ for any
$c > 1$ unless P=NP (that is, the problem is not in APX unless P=NP). Håstad and
Venkatesh [23] obtained some approximation algorithms for the two problems. In
particular, they proved that for Max-r-Lin2-AA there exist a constant $c > 1$ and a
randomized polynomial-time algorithm that, with probability at least $3/4$, outputs
an assignment with an approximation ratio of at most $c^r \sqrt{m}$.

The problem MaxLin2-AA was first studied in the context of parameterized
complexity by Mahajan et al. [26] who naturally took $k$ as the parameter^4. We will
denote this parameterized problem by MaxLin2-AA[k]. Despite some progress
[10, 11, 21], the complexity of MaxLin2-AA[k] has remained prominently open in the
research area of "parameterizing above guaranteed bounds" that has attracted much
recent attention (cf. [1, 7, 10, 11, 21, 24, 26]) and that still poses well-known and
longstanding open problems (e.g., how difficult is it to determine if a planar graph
has an independent set of size at least $(n/4) + k$?). One can parameterize
Max-r-Lin2-AA by $k$ for any fixed $r$ (denoted by Max-r-Lin2-AA[k]) or by both
$k$ and $r$ (denoted by Max-r-Lin2-AA[k, r])^5.

^4 We provide basic definitions on parameterized algorithms and complexity in
Subsection 1.4 below.

Define the excess for $x^0 = (x_1^0, \ldots, x_n^0) \in \{-1, 1\}^n$ over $S$ to be

$$\varepsilon_S(x^0) = \sum_{j=1}^{m} c_j \prod_{i \in I_j} x_i^0, \quad \text{where } c_j = w_j b_j.$$

Note that $\varepsilon_S(x^0)$ is the total weight of equations satisfied by $x^0$
minus the total weight of equations falsified by $x^0$. The maximum possible value
of $\varepsilon_S(x^0)$ is the maximum excess of $S$. Håstad and Venkatesh [23]
initiated the study of the excess and further research on the topic was carried out
by Crowston et al. [11] who concentrated on MaxLin2-AA. In this paper, we study the
maximum excess for Max-r-Lin2-AA. Note that the excess is a pseudo-boolean
function [9], i.e., a function that maps $\{-1, 1\}^n$ to the set of reals.
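As a minimal sketch of this definition, the excess can be evaluated directly from the signed weights $c_j = w_j b_j$. The triple representation `(I, b, w)` and the function name `excess` are illustrative assumptions, not notation from the paper.

```python
def excess(equations, x):
    """Excess of assignment x over the system: sum of c_j * prod_{i in I_j} x_i.

    equations: list of (I, b, w) with 0-based indices I, b in {-1, 1}, weight w
    (hypothetical representation); x: sequence of -1/1 values.
    """
    total = 0
    for I, b, w in equations:
        value = 1
        for i in I:
            value *= x[i]
        total += w * b * value  # +w if the equation is satisfied, -w if falsified
    return total

system = [((0,), 1, 1), ((1,), 1, 1), ((0, 1), -1, 1)]
print(excess(system, (1, 1)))    # two equations satisfied, one falsified
print(excess(system, (-1, -1)))  # all three equations falsified
```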

1.2 Main Results and Structure of the Paper. The main results of this paper are
Theorems 3 and 4. In 2006 Mahajan et al. [26] introduced MaxLin2-AA[k] and asked
what its complexity is. We answer this question in Theorem 3 by showing that
MaxLin2-AA[k] admits a kernel with at most $O(k^2 \log k)$ variables. The proof of
Theorem 3 is based on the main result in [11] and on a new algorithm for
MaxLin2-AA[k] of complexity $n^{2k} (nm)^{O(1)}$. We also prove that MaxLin2-AA[k]
can be solved in time $2^{O(k \log k)} (nm)^{O(1)}$ (Corollary 1). The other main
result of this paper, Theorem 4, gives a sharp lower bound on the maximum excess
for Max-r-Lin2-AA as follows. Let $S$ be an irreducible system (i.e., a system that
cannot be reduced using Rule 1 or 2 defined below) and suppose that each equation
contains at most $r$ variables. Let $n \ge (k - 1)r + 1$ and let $w_{\min}$ be the
minimum weight of an equation of $S$. Then, in time $m^{O(1)}$, we can find an
assignment $x^0$ to variables of $S$ such that
$\varepsilon_S(x^0) \ge k \cdot w_{\min}$.

In Section 2, we give some reduction rules for Max-r-Lin2-AA, describe an
Algorithm $\mathcal{H}$ introduced by Crowston et al. [11] and give some properties
of the maximum excess, irreducible systems and Algorithm $\mathcal{H}$. In Section 3,
we prove Theorem 3 and Corollary 1. A key tool in our proof of Theorem 4 is a lemma
on a so-called sum-free subset in a set of vectors from $\mathbb{F}_2^n$. The lemma
and Theorem 4 are proved in Section 4. We prove several corollaries of Theorem 4 in
Section 5. The corollaries are on parameterized and approximation algorithms as well
as on lower bounds for the maxima of pseudo-boolean functions and their applications
in graph theory. Our results on parameterized algorithms improve a number of
previously known results including those of Kim and Williams [24]. We conclude the
paper with Section 6, where we discuss some open problems.

1.3 Corollaries of Theorem 4. The following results have been obtained for
Max-r-Lin2-AA[k] when $r$ is fixed and for Max-r-Lin2-AA[k, r]. Gutin et al. [21]
proved that Max-r-Lin2-AA[k] is fixed-parameter tractable and, moreover, has a
kernel with $n \le m = O(k^2)$. This kernel is, in fact, a kernel of
Max-r-Lin2-AA[k, r] with $n \le m = O(9^r k^2)$. This kernel for Max-r-Lin2-AA[k]
was improved by Crowston et al. [11], with respect to the number of variables, to
$n = O(k \log k)$. For Max-r-Lin2-AA[k], Kim and Williams [24] were the first to
obtain a kernel with a linear number of variables, i.e., $n = O(k)$. This kernel is,
in fact, a kernel with $n \le r(r + 1)k$ for Max-r-Lin2-AA[k, r]. In this paper, we
obtain a kernel with $n \le (2k - 1)r$ for Max-r-Lin2-AA[k, r]. As an easy
consequence of this result we show that the maximization problem Max-r-Lin2-AA is
in APX if restricted to $m = O(n)$ and the weight of each equation is bounded by a
constant. This is in sharp contrast with the fact mentioned above that for each
$r \ge 3$, Max-r-Lin2-AA is not in APX.

^5 While in the preceding literature only MaxLin2-AA[k] was considered, we introduce
and study Max-r-Lin2-AA[k, r] in the spirit of Multivariate Algorithmics as outlined
by Fellows [17] and Niedermeier [28].

Fourier analysis of pseudo-boolean functions, i.e., functions
$f : \{-1, 1\}^n \to \mathbb{R}$, has been used in many areas of computer science
(cf. [1, 11, 29]). In Fourier analysis, the Boolean domain is often assumed to be
$\{-1, 1\}^n$ rather than the more usual $\{0, 1\}^n$ and we will follow this
assumption in our paper. Here we use the following well-known and easy to prove
fact [29]: each function $f : \{-1, 1\}^n \to \mathbb{R}$ can be uniquely written as

$$f(x) = \hat{f}(\emptyset) + \sum_{I \in \mathcal{F}} \hat{f}(I) \prod_{i \in I} x_i, \qquad (1)$$

where $\mathcal{F} \subseteq \{I : \emptyset \ne I \subseteq [n]\}$,
$[n] = \{1, 2, \ldots, n\}$ and $\hat{f}(I)$ are non-zero reals.
Formula (1) is the Fourier expansion of $f$ and $\hat{f}(I)$ are the Fourier
coefficients of $f$. The right-hand side of (1) is a polynomial and the degree
$\max\{|I| : I \in \mathcal{F}\}$ of this polynomial will be called the degree of
$f$. In Section 5, we obtain the following lower bound on the maximum of a
pseudo-boolean function $f$ of degree $r$:

$$\max_x f(x) \ge \hat{f}(\emptyset) + \lfloor (\operatorname{rank} A + r - 1)/r \rfloor \cdot \min\{|\hat{f}(I)| : I \in \mathcal{F}\}, \qquad (2)$$

where $A$ is a $(0, 1)$-matrix with entries $a_{ij}$ such that $a_{ij} = 1$ if and
only if term $j$ in (1) contains $x_i$ (as $\operatorname{rank} A$ does not depend on
the order of the columns in $A$, we may order the terms in (1) arbitrarily).

To demonstrate the combinatorial usefulness of (2), we apply it to obtain a short
proof of the well-known lower bound of Edwards-Erdős on the maximum size of a
bipartite subgraph in a graph (the Max Cut problem). Erdős [15] conjectured and
Edwards [14] proved that every connected graph with $n$ vertices and $m$ edges has a
bipartite subgraph with at least $m/2 + (n - 1)/4$ edges. For short graph-theoretical
proofs, see, e.g., Bollobás and Scott [7] and Erdős et al. [16]. We consider the
Balanced Subgraph problem [3] that generalizes Max Cut and show that our proof of
the Edwards-Erdős bound can be easily extended to Balanced Subgraph, but the
graph-theoretical proofs of the Edwards-Erdős bound do not seem to be easily
extendable to Balanced Subgraph.

1.4 Parameterized Complexity and (Bi)kernelization. A parameterized problem is
a subset $L \subseteq \Sigma^* \times \mathbb{N}$ over a finite alphabet $\Sigma$.
$L$ is fixed-parameter tractable (FPT, for short) if the membership in $L$ of an
instance $(x, k) \in \Sigma^* \times \mathbb{N}$ can be decided in time
$f(k)|x|^{O(1)}$, where $f$ is a function of the parameter $k$ only. When the
decision time is replaced by the much more powerful $|x|^{O(f(k))}$, we obtain the
class XP, where each problem is polynomial-time solvable for any fixed value of $k$.
There is an infinite number of parameterized complexity classes between FPT and XP
(for each integer $t \ge 1$, there is a class W[t]) and they form the following
tower: $\mathrm{FPT} \subseteq W[1] \subseteq W[2] \subseteq \cdots \subseteq \mathrm{XP}$.
For the definition of the classes W[t], see, e.g., [18].

Given a pair $L, L'$ of parameterized problems, a bikernelization from $L$ to $L'$
is a polynomial-time algorithm that maps an instance $(x, k)$ to an instance
$(x', k')$ (the bikernel) such that (i) $(x, k) \in L$ if and only if
$(x', k') \in L'$, (ii) $k' \le f(k)$, and (iii) $|x'| \le g(k)$ for some functions
$f$ and $g$. The function $g(k)$ is called the size of the bikernel. The notion of a
bikernelization was introduced in [1], where it was observed that a parameterized
problem $L$ is fixed-parameter tractable if and only if it is decidable and admits a
bikernelization from itself to a parameterized problem $L'$. A kernelization of a
parameterized problem $L$ is simply a bikernelization from $L$ to itself; the
bikernel is the kernel, and $g(k)$ is the size of the kernel. Due to the importance
of polynomial-time kernelization algorithms in applied multivariate algorithmics,
low degree polynomial size kernels and bikernels are of considerable interest, and
the subject has developed substantial theoretical depth, cf. [1, 4-6, 12, 18-21].

The case of several parameters $k_1, \ldots, k_t$ can be reduced to the one
parameter case by setting $k = k_1 + \cdots + k_t$, see, e.g., [12].

2 Maximum Excess, Irreducible Systems and Algorithm H 

Recall that an instance of MaxLin2-AA consists of a system $S$ of equations
$\prod_{i \in I_j} x_i = b_j$, $j \in [m]$, where $\emptyset \ne I_j \subseteq [n]$,
$b_j \in \{-1, 1\}$, $x_i \in \{-1, 1\}$. An equation $\prod_{i \in I_j} x_i = b_j$
has an integral positive weight $w_j$. Recall that the excess for
$x^0 = (x_1^0, \ldots, x_n^0) \in \{-1, 1\}^n$ over $S$ is
$\varepsilon_S(x^0) = \sum_{j=1}^{m} c_j \prod_{i \in I_j} x_i^0$, where
$c_j = w_j b_j$. The excess $\varepsilon_S(x^0)$ is the total weight of equations
satisfied by $x^0$ minus the total weight of equations falsified by $x^0$. The
maximum possible value of $\varepsilon_S(x^0)$ is the maximum excess of $S$.

Remark 1. Observe that the answer to MaxLin2-AA is Yes if and only if the maxi- 
mum excess is at least 2k. 
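The equivalence in Remark 1 follows from a one-line computation. In the sketch below, $W_{\mathrm{sat}}$ and $W_{\mathrm{fal}}$ are names introduced here (not in the paper) for the total weights of equations satisfied and falsified by an assignment $x^0$.

```latex
\begin{align*}
W_{\mathrm{sat}} + W_{\mathrm{fal}} &= W, \qquad
W_{\mathrm{sat}} - W_{\mathrm{fal}} = \varepsilon_S(x^0)
\;\Longrightarrow\;
W_{\mathrm{sat}} = \frac{W + \varepsilon_S(x^0)}{2}, \\
W_{\mathrm{sat}} \ge \frac{W}{2} + k
&\iff \varepsilon_S(x^0) \ge 2k .
\end{align*}
```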

Remark 2. The excess $\varepsilon_S(x)$ is a pseudo-boolean function and its
Fourier expansion is $\varepsilon_S(x) = \sum_{j=1}^{m} c_j \prod_{i \in I_j} x_i$.
Moreover, observe that every pseudo-boolean function
$f(x) = \sum_{I \in \mathcal{F}} \hat{f}(I) \prod_{i \in I} x_i$ (where
$\hat{f}(\emptyset) = 0$) is the excess over the system $\prod_{i \in I} x_i = b_I$,
$I \in \mathcal{F}$, where $b_I = 1$ if $\hat{f}(I) > 0$ and $b_I = -1$ if
$\hat{f}(I) < 0$, with weights $|\hat{f}(I)|$. Thus, studying the maximum excess
over a MaxLin2-AA-system (with real weights) is equivalent to studying the maximum
of a pseudo-boolean function.

Consider two reduction rules for MaxLin2 studied in [21]. 

Reduction Rule 1. If we have, for a subset $I$ of $[n]$, an equation
$\prod_{i \in I} x_i = b'_I$ with weight $w'_I$, and an equation
$\prod_{i \in I} x_i = b''_I$ with weight $w''_I$, then we replace this pair by one of
these equations with weight $w'_I + w''_I$ if $b'_I = b''_I$ and, otherwise, by the
equation whose weight is bigger, modifying its new weight to be the difference of the
two old ones. If the resulting weight is 0, we delete the equation from the system.
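A minimal sketch of Rule 1, assuming equations are stored as triples `(I, b, w)` (a representation chosen here for illustration): summing the signed weights $w b$ per left-hand side has the same effect as applying the rule to pairs for as long as possible.

```python
from collections import defaultdict

def apply_rule1(equations):
    """Exhaustively apply Reduction Rule 1.

    equations: iterable of (I, b, w); I is any iterable of variable indices.
    Returns a list of (frozenset(I), b, w) with pairwise distinct left-hand sides.
    """
    signed = defaultdict(int)
    for I, b, w in equations:
        signed[frozenset(I)] += b * w  # equal signs add up, opposite signs cancel
    reduced = []
    for I, c in signed.items():
        if c != 0:  # an equation whose resulting weight is 0 is deleted
            reduced.append((I, 1 if c > 0 else -1, abs(c)))
    return reduced

print(apply_rule1([((0, 1), 1, 3), ((1, 0), -1, 5), ((2,), 1, 2), ((2,), -1, 2)]))
```

In the example, the two equations on variables 0 and 1 merge into one equation with right-hand side -1 and weight 2, and the contradictory pair on variable 2 disappears.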
Reduction Rule 2. Let $A$ be the matrix over $\mathbb{F}_2$ corresponding to the set
of equations in $S$, such that $a_{ji} = 1$ if $i \in I_j$ and 0, otherwise. Let
$t = \operatorname{rank} A$ and suppose columns $a^{i_1}, \ldots, a^{i_t}$ of $A$ are
linearly independent. Then delete all variables not in
$\{x_{i_1}, \ldots, x_{i_t}\}$ from the equations of $S$.
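One way to realize Rule 2 is sketched below. The rule allows any maximal set of linearly independent columns; this sketch picks the pivot columns found by Gaussian elimination over $\mathbb{F}_2$, which is one valid choice. Rows are stored as integer bitmasks, and the function names are assumptions made here.

```python
def pivot_columns(rows, n):
    """Pivot columns of a GF(2) matrix; rows are int bitmasks (bit i = column i).

    The pivot columns form a maximum linearly independent set of columns.
    """
    rows = list(rows)
    pivots, r = [], 0
    for col in range(n):
        # Look for a row at position r or below with a 1 in this column.
        p = next((i for i in range(r, len(rows)) if (rows[i] >> col) & 1), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(len(rows)):
            if i != r and (rows[i] >> col) & 1:
                rows[i] ^= rows[r]  # row addition over GF(2)
        pivots.append(col)
        r += 1
    return pivots

def apply_rule2(equations, n):
    """Delete every variable outside the chosen independent columns."""
    masks = [sum(1 << i for i in I) for I, _, _ in equations]
    keep = set(pivot_columns(masks, n))
    return [(frozenset(i for i in I if i in keep), b, w) for I, b, w in equations]

# x0*x1 = 1 and x0*x1*x2 = -1: columns 0 and 1 coincide, so the rank is 2.
print(apply_rule2([((0, 1), 1, 1), ((0, 1, 2), -1, 1)], 3))
```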

Lemma 1. [21] Let $S'$ be obtained from $S$ by Rule 1 or 2. Then the maximum excess
of $S'$ is equal to the maximum excess of $S$. Moreover, $S'$ can be obtained from
$S$ in time polynomial in $n$ and $m$.

If we cannot change a weighted system $S$ using Rules 1 and 2, we call it irreducible.

Lemma 2. Let $S'$ be a system obtained from $S$ by first applying Rule 1 as long as
possible and then Rule 2 as long as possible. Then $S'$ is irreducible.

Proof. Let $S^*$ denote the system obtained from $S$ by applying Rule 1 as long as
possible. Without loss of generality, assume that
$x_1 \notin \{x_{i_1}, \ldots, x_{i_t}\}$ (see the description of Rule 2) and thus
Rule 2 removes $x_1$ from $S^*$. To prove the lemma it suffices to show that after
the removal of $x_1$ no pair of equations has the same left-hand side. Suppose that
there is a pair of equations in $S^*$ which has the same left-hand side after the
removal of $x_1$; let $\prod_{i \in I'} x_i = b'$ and $\prod_{i \in I''} x_i = b''$
be such equations and let $I' = I'' \cup \{1\}$. Then the entries of the first column
of $A$, $a^1$, corresponding to the pair of equations are 1 and 0, but in all the
other columns of $A$ the entries corresponding to the pair of equations are either
1,1 or 0,0. Thus, $a^1$ is independent from all the other columns of $A$, a
contradiction. □

Let $S$ be an irreducible system of MaxLin2-AA. Consider the following algorithm
introduced in [11]. We assume that, in the beginning, no equation or variable in $S$
is marked.

Algorithm $\mathcal{H}$

While the system $S$ is nonempty and the total weight of marked equations is
less than $2k$ do the following:

1. Choose an arbitrary equation $\prod_{i \in I} x_i = b$ and mark an arbitrary
variable $x_l$ such that $l \in I$.

2. Mark this equation and delete it from the system.

3. Replace every equation $\prod_{i \in I'} x_i = b'$ in the system containing $x_l$
by $\prod_{i \in I \Delta I'} x_i = bb'$, where $I \Delta I'$ is the symmetric
difference of $I$ and $I'$ (the weight of the equation is unchanged).

4. Apply Reduction Rule 1 to the system. 



Note that Algorithm $\mathcal{H}$ replaces $S$ with an equivalent system under the
assumption that the marked equations are satisfied; that is, for every assignment of
values to the variables $x_1, \ldots, x_n$ that satisfies the marked equations, both
systems have the same excess. As a result, we have the following lemma.

Lemma 3. [11] Let $S$ be an irreducible system and assume that Algorithm
$\mathcal{H}$ marks equations of total weight $w$. Then the maximum excess of $S$ is
at least $w$. In particular, if $w \ge 2k$ then $S$ is a YES-instance of
MaxLin2-AA[k].
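The loop of Algorithm $\mathcal{H}$ can be sketched as follows. The paper leaves the choice of equation and of marked variable arbitrary; this sketch fixes one choice (the first stored equation and its smallest variable index) purely for illustration. The system is kept as a dictionary from left-hand side to signed weight $c = wb$, so that Rule 1 amounts to adding signed weights; all names are assumptions made here.

```python
def algorithm_h(equations, k):
    """Run Algorithm H; return (total marked weight, list of marked equations).

    equations: list of (I, b, w) with nonempty I (hypothetical representation).
    """
    system = {}
    for I, b, w in equations:  # Rule 1 on the input: add signed weights
        system[frozenset(I)] = system.get(frozenset(I), 0) + b * w
    system = {I: c for I, c in system.items() if c != 0}
    marked, marked_weight = [], 0
    while system and marked_weight < 2 * k:
        I = next(iter(system))        # Step 1: pick an equation ...
        l = min(I)                    # ... and mark one of its variables
        c = system.pop(I)             # Step 2: mark the equation and delete it
        b = 1 if c > 0 else -1
        marked.append((I, b, abs(c), l))
        marked_weight += abs(c)
        updated = {}
        for J, cj in system.items():
            if l in J:                # Step 3: multiply by the marked equation
                J, cj = J ^ I, cj * b
            updated[J] = updated.get(J, 0) + cj  # Step 4: Rule 1
        system = {J: cj for J, cj in updated.items() if cj != 0}
    return marked_weight, marked

print(algorithm_h([((0,), 1, 2), ((1,), 1, 3)], 2)[0])
```

By Lemma 3, the returned marked weight is a lower bound on the maximum excess of an irreducible input system; different choices in Step 1 may give different bounds.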
3 MaxLin2-AA



The following two theorems form a basis for proving Theorem 3, the main result of this 
section. 

Theorem 1. There exists an $n^{2k} (nm)^{O(1)}$-time algorithm for MaxLin2-AA[k]
that returns an assignment of excess of at least $2k$ if one exists, and returns NO
otherwise.

Proof. Suppose we have an instance $\mathcal{L}$ of MaxLin2-AA[k] that is reduced by
Rules 1 and 2, and that the maximum excess of $\mathcal{L}$ is at least $2k$. Let $A$
be the matrix introduced in Rule 2. Pick $n$ equations $e_1, \ldots, e_n$ such that
their rows in $A$ are linearly independent. Any assignment must either satisfy one of
these equations, or falsify them all. We can check, in time $(nm)^{O(1)}$, what
happens if they are all falsified, as fixing the values of these $n$ equations fixes
the values of all the others. If falsifying all the equations does not lead to an
excess of at least $2k$, then any assignment of values to $x_1, \ldots, x_n$ that
leads to excess at least $2k$ must satisfy at least one of $e_1, \ldots, e_n$. Thus,
by Lemma 3, Algorithm $\mathcal{H}$ can mark one of these equations and achieve an
excess of at least $2k$.

This gives us the following depth-bounded search tree. At each node $N$ of the tree,
reduce the system by Rules 1 and 2, and let $n'$ be the number of variables in the
reduced system. Then find $n'$ equations $e_1, \ldots, e_{n'}$ corresponding to
linearly independent vectors. Find an assignment of values to $x_1, \ldots, x_{n'}$
that falsifies all of $e_1, \ldots, e_{n'}$. Check whether this assignment achieves
excess of at least $2k - w^*$, where $w^*$ is the total weight of equations marked by
$\mathcal{H}$ in all predecessors of $N$. If it does, then return the assignment and
stop the algorithm. Otherwise, split into $n'$ branches. In the $i$'th branch, run an
iteration of $\mathcal{H}$ marking equation $e_i$. Then repeat this algorithm for
each new node. Whenever the total weight of marked equations is at least $2k$, return
the suitable assignment. Clearly, the algorithm will terminate without an assignment
if the maximum excess of $\mathcal{L}$ is less than $2k$.

All the operations at each node take time $(nm)^{O(1)}$, and there are less than
$n^{2k+1}$ nodes in the search tree. Therefore this algorithm takes time
$n^{2k} (nm)^{O(1)}$. □

Theorem 2. [11] Let $S$ be an irreducible system of MaxLin2-AA[k] and let $k \ge 2$.
If $k \le m \le 2^{n/(k-1)} - 2$, then the maximum excess of $S$ is at least $k$.
Moreover, we can find an assignment with excess of at least $k$ in time $m^{O(1)}$.

Theorem 3. The problem MaxLin2-AA[k] has a kernel with at most $O(k^2 \log k)$
variables.

Proof. Let $\mathcal{L}$ be an instance of MaxLin2-AA[k] and let $S$ be the system of
$\mathcal{L}$ with $m$ equations and $n$ variables. We may assume that $S$ is
irreducible. Let the parameter $k$ be an arbitrary positive integer.

If $m < 2k$ then $n < 2k = O(k^2 \log k)$. If $2k \le m \le 2^{n/(2k-1)} - 2$ then,
by Theorem 2 (applied with $2k$ in place of $k$) and Remark 1, the answer to
$\mathcal{L}$ is YES and the corresponding assignment can be found in polynomial
time. If $m \ge n^{2k}$ then, by Theorem 1, we can solve $\mathcal{L}$ in polynomial
time.

Finally we consider the case $2^{n/(2k-1)} - 1 \le m \le n^{2k} - 1$. Hence,
$n^{2k} \ge 2^{n/(2k-1)}$. Taking logarithms gives $2k(2k - 1) \ge n/\log n$ and,
therefore, $4k^2 \ge 2k + n/\log n \ge \sqrt{n}$ and $n \le (2k)^4$. Hence,
$n \le 4k^2 \log n \le 4k^2 \log(16k^4) = O(k^2 \log k)$.

Since $S$ is irreducible, $m < 2^n$ and thus we have obtained the desired kernel. □
Corollary 1. The problem MaxLin2-AA[k] can be solved in time
$2^{O(k \log k)} (nm)^{O(1)}$.

Proof. Let $\mathcal{L}$ be an instance of MaxLin2-AA[k]. By Theorem 3, in time
$(nm)^{O(1)}$ either we solve $\mathcal{L}$ or we obtain a kernel with at most
$O(k^2 \log k)$ variables. In the second case, we can solve the reduced system
(kernel) by the algorithm of Theorem 1 in time
$[O(k^2 \log k)]^{2k} [O(k^2 \log k) m]^{O(1)} = 2^{O(k \log k)} m^{O(1)}$. Thus, the
total time is $2^{O(k \log k)} (nm)^{O(1)}$. □



4 Max-r-Lin2-AA 

In order to prove Theorem 4, we will need the following lemma on vectors in
$\mathbb{F}_2^n$. Let $M$ be a set of $m$ vectors in $\mathbb{F}_2^n$ and let $A$ be
an $m \times n$-matrix in which the vectors of $M$ are rows. Using Gaussian
elimination on $A$ one can find a maximum size linearly independent subset of $M$ in
polynomial time [25]. Let $K$ and $M$ be sets of vectors in $\mathbb{F}_2^n$ such that
$K \subseteq M$. We say $K$ is $M$-sum-free if no sum of two or more distinct vectors
in $K$ is equal to a vector in $M$. Observe that $K$ is $M$-sum-free if and only if
$K$ is linearly independent and no sum of vectors in $K$ is equal to a vector in
$M \setminus K$.

Lemma 4. Let $M$ be a set of vectors in $\mathbb{F}_2^n$ such that $M$ contains a
basis of $\mathbb{F}_2^n$. Suppose that each vector of $M$ contains at most $r$
non-zero coordinates. If $k \ge 1$ is an integer and $n \ge r(k - 1) + 1$, then in
time $|M|^{O(1)}$, we can find a subset $K$ of $M$ of $k$ vectors such that $K$ is
$M$-sum-free.

Proof. Let $\mathbf{1} = (1, \ldots, 1)$ be the vector in $\mathbb{F}_2^n$ in which
every coordinate is 1. Note that $\mathbf{1} \notin M$. By our assumption $M$
contains a basis of $\mathbb{F}_2^n$ and we may find such a basis in polynomial time
(using Gaussian elimination, see above). We may write $\mathbf{1}$ as a sum of some
vectors of this basis $B$. This implies that $\mathbf{1}$ can be expressed as follows:
$\mathbf{1} = v_1 + v_2 + \cdots + v_s$, where $\{v_1, \ldots, v_s\} \subseteq B$ and
$v_1, \ldots, v_s$ are linearly independent, and we can find such an expression in
polynomial time.

For each $v \in M \setminus \{v_1, \ldots, v_s\}$, consider the set
$S_v = \{v, v_1, \ldots, v_s\}$. In polynomial time, we may check whether $S_v$ is
linearly independent. Consider two cases:

Case 1: $S_v$ is linearly independent for each
$v \in M \setminus \{v_1, \ldots, v_s\}$. Then $\{v_1, \ldots, v_s\}$ is
$M$-sum-free (here we also use the fact that $\{v_1, \ldots, v_s\}$ is linearly
independent). Since each $v_i$ has at most $r$ positive coordinates, we have
$sr \ge n > r(k - 1)$. Hence, $s > k - 1$ implying that $s \ge k$. Thus,
$\{v_1, \ldots, v_k\}$ is the required set $K$.

Case 2: $S_v$ is linearly dependent for some
$v \in M \setminus \{v_1, \ldots, v_s\}$. Then we can find (in polynomial time)
$I \subseteq [s]$ such that $v = \sum_{i \in I} v_i$. Thus, we have a shorter
expression for $\mathbf{1}$: $\mathbf{1} = v'_1 + v'_2 + \cdots + v'_{s'}$, where
$\{v'_1, \ldots, v'_{s'}\} = \{v\} \cup \{v_i : i \notin I\}$. Note that
$\{v'_1, \ldots, v'_{s'}\}$ is linearly independent.

Since $s \le n$ and Case 2 produces a shorter expression for $\mathbf{1}$, after at
most $n$ iterations of Case 2 we will arrive at Case 1. □
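The proof is constructive, and the following sketch mirrors its two cases under some assumptions made here: vectors of $\mathbb{F}_2^n$ are stored as nonzero integer bitmasks, `express` is a hypothetical helper that writes a target as a sum of linearly independent members of a list (or reports that this is impossible), and `is_sum_free` is a brute-force checker intended only for small examples.

```python
from itertools import combinations

def express(target, vectors):
    """Indices J with XOR of vectors[j] (j in J) equal to target, else None.

    Only vectors that enter the elimination basis are used, so the selected
    vectors are linearly independent.
    """
    basis = {}  # leading bit -> (reduced vector, bitmask of contributing indices)
    for idx, v in enumerate(vectors):
        combo = 1 << idx
        while v:
            p = v.bit_length() - 1
            if p not in basis:
                basis[p] = (v, combo)
                break
            v, combo = v ^ basis[p][0], combo ^ basis[p][1]
    combo = 0
    while target:
        p = target.bit_length() - 1
        if p not in basis:
            return None
        target, combo = target ^ basis[p][0], combo ^ basis[p][1]
    return {j for j in range(len(vectors)) if (combo >> j) & 1}

def sum_free_subset(M, n, k):
    """Follow the proof of Lemma 4 and return k vectors forming an M-sum-free set."""
    M = [v for v in dict.fromkeys(M) if v]       # distinct, nonzero vectors
    idx = express((1 << n) - 1, M)               # write the all-ones vector
    if idx is None:
        raise ValueError("M does not span the all-ones vector")
    cur = [M[j] for j in sorted(idx)]
    while True:
        for v in M:
            if v in cur:
                continue
            I = express(v, cur)
            if I is not None:                    # Case 2: shorten the expression
                cur = [v] + [u for j, u in enumerate(cur) if j not in I]
                break
        else:
            break                                # Case 1: cur is M-sum-free
    if len(cur) < k:
        raise ValueError("hypotheses of Lemma 4 are not met")
    return cur[:k]

def is_sum_free(K, M):
    """Brute-force check that no sum of two or more vectors of K lies in M."""
    members = set(M)
    for size in range(2, len(K) + 1):
        for chosen in combinations(K, size):
            total = 0
            for v in chosen:
                total ^= v
            if total in members:
                return False
    return True

M = [0b001, 0b010, 0b100, 0b011]   # n = 3, r = 2, so k = 2 is allowed
K = sum_free_subset(M, 3, 2)
print(K, is_sum_free(K, M))
```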

Now we can prove the main result of this section. 
Theorem 4. Let $S$ be an irreducible system and suppose that each equation contains
at most $r$ variables. Let $n \ge (k - 1)r + 1$ and let $w_{\min}$ be the minimum
weight of an equation of $S$. Then, in time $m^{O(1)}$, we can find an assignment
$x^0$ to variables of $S$ such that $\varepsilon_S(x^0) \ge k \cdot w_{\min}$.

Proof. Consider a set $M$ of vectors in $\mathbb{F}_2^n$ corresponding to equations
in $S$ as follows: for each equation $\prod_{i \in I} x_i = b$ in $S$, define a
vector $v = (v_1, \ldots, v_n) \in M$, where $v_i = 1$ if $i \in I$ and $v_i = 0$,
otherwise.

As $S$ is reduced by Rule 2 we have that $M$ contains a basis for $\mathbb{F}_2^n$,
and each vector contains at most $r$ non-zero coordinates and
$n \ge (k - 1)r + 1$. Therefore, using Lemma 4 we can find an $M$-sum-free set $K$ of
$k$ vectors. Let $\{e_{j_1}, \ldots, e_{j_k}\}$ be the corresponding set of
equations. Run Algorithm $\mathcal{H}$, choosing at Step 1 an equation of $S$ from
$\{e_{j_1}, \ldots, e_{j_k}\}$ each time, and let $S'$ be the resulting system.
Algorithm $\mathcal{H}$ will run for $k$ iterations of the while loop as no equation
from $\{e_{j_1}, \ldots, e_{j_k}\}$ will be deleted before it has been marked.

Indeed, suppose that this is not true. Then for some $e_{j_l}$ and some other
equation $e$ in $S$, after applying Algorithm $\mathcal{H}$ for at most $l - 1$
iterations $e_{j_l}$ and $e$ contain the same variables. Thus, there are vectors
$v_j \in K$ and $v \in M$ and a pair of nonintersecting subsets $K'$ and $K''$ of
$K \setminus \{v, v_j\}$ such that
$v_j + \sum_{u \in K'} u = v + \sum_{u \in K''} u$. Thus,
$v = v_j + \sum_{u \in K' \cup K''} u$, a contradiction with the definition of $K$.

Thus, by Lemma 3, we are done. □

Remark 3. To see that the inequality $n \ge r(k - 1) + 1$ in the theorem is best
possible, assume that $n = r(k - 1)$ and consider a partition of $[n]$ into $k - 1$
subsets $N_1, \ldots, N_{k-1}$, each of size $r$. Let $S$ be the system consisting of
subsystems $S_i$, $i \in [k - 1]$, such that a subsystem $S_i$ is comprised of
equations $\prod_{i \in I} x_i = -1$ of weight 1 for every $I$ such that
$\emptyset \ne I \subseteq N_i$. Now assume without loss of generality that
$N_i = [r]$. Observe that the assignment $(x_1, \ldots, x_r) = (1, \ldots, 1)$
falsifies all equations of $S_i$ but by setting $x_j = -1$ for any $j \in [r]$ we
satisfy the equation $x_j = -1$ and turn the remaining equations into pairs of the
form $\prod_{i \in I} x_i = -1$ and $\prod_{i \in I} x_i = 1$. Thus, the maximum
excess of $S_i$ is 1 and the maximum excess of $S$ is $k - 1$.
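The claim that each block $S_i$ has maximum excess 1 can be checked by brute force for small $r$. The sketch below is only a sanity check under that restriction; the function name is an assumption. It agrees with the identity $-\sum_{\emptyset \ne I \subseteq [r]} \prod_{i \in I} x_i = 1 - \prod_{i=1}^{r} (1 + x_i)$, whose right-hand side equals 1 as soon as some $x_i = -1$.

```python
from itertools import combinations, product
from math import prod

def max_excess_block(r):
    """Maximum excess of the system with equations prod_{i in I} x_i = -1
    (weight 1) for every nonempty I within {0, ..., r-1}."""
    subsets = [c for size in range(1, r + 1) for c in combinations(range(r), size)]
    best = None
    for x in product((-1, 1), repeat=r):
        # Each equation has signed weight c = w * b = -1.
        e = sum(-prod(x[i] for i in I) for I in subsets)
        best = e if best is None else max(best, e)
    return best

for r in range(1, 5):
    print(r, max_excess_block(r))
```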

Remark 4. It is easy to check that Theorem 4 holds when the weights of equations in S 
are real numbers, not necessarily integers. 



5 Applications of Theorem 4 

Theorem 5. The problem Max-r-Lin2-AA[k, r] has a kernel with at most $(2k - 1)r$
variables.

Proof. Let $T$ be the system of an instance of Max-r-Lin2-AA[k, r]. After applying
Rules 1 and 2 to $T$ as long as possible, we obtain a new system $S$ which is
irreducible. Let $n$ be the number of variables in $S$ and observe that the number of
variables in an equation in $S$ is bounded by $r$ (as in $T$). If
$n \ge (2k - 1)r + 1$, then, by Theorem 4 (applied with $2k$ in place of $k$) and
Remark 1, $S$ is a YES-instance of Max-r-Lin2-AA[k, r] and, hence, by Lemma 1, $S$
and $T$ are both YES-instances of Max-r-Lin2-AA[k, r]. Otherwise
$n \le (2k - 1)r$ and we have the required kernel. □
Corollary 2. The maximization problem Max-r-Lin2-AA is in APX if restricted to
$m = O(n)$ and the weight of each equation is bounded by a constant.

Proof. It follows from Theorem 5 that the answer to Max-r-Lin2-AA, as a decision
problem, is Yes as long as $2k \le \lfloor (n + r - 1)/r \rfloor$. This implies
approximation ratio at most $W / (2 \lfloor (n + r - 1)/r \rfloor)$ which is bounded
by a constant provided $m = O(n)$ and the weight of each equation is bounded by a
constant (then $W = O(n)$). □

The (parameterized) Boolean Max-r-Constraint Satisfaction Problem (Max-r-CSP)
generalizes Max-r-Lin2-AA[k, r] as follows: We are given a set $\Phi$ of Boolean
functions, each involving at most $r$ variables, and a collection $\mathcal{F}$ of
$m$ Boolean functions, each $f \in \mathcal{F}$ being a member of $\Phi$, and each
acting on some subset of the $n$ Boolean variables $x_1, x_2, \ldots, x_n$ (each
$x_i \in \{-1, 1\}$). We are to decide whether there is a truth assignment to the $n$
variables such that the total number of satisfied functions is at least $E + k$,
where $E$ is the average value of the number of satisfied functions. The parameters
are $k$ and $r$.

Using a bikernelization algorithm described in [1, 11] and our new kernel result,
it is easy to see that Max-r-CSP with parameters $k$ and $r$ admits a bikernel with
at most $(k 2^{r+1} - 1)r$ variables. This result improves the corresponding result
of Kim and Williams [24] ($n \le kr(r + 1)2^r$).

The following result is essentially a corollary of Theorem 4 and Remark 4. 

Theorem 6. Let

$$f(x) = \hat{f}(\emptyset) + \sum_{I \in \mathcal{F}} \hat{f}(I) \prod_{i \in I} x_i \qquad (3)$$

be a pseudo-boolean function of degree $r$. Then

$$\max_x f(x) \ge \hat{f}(\emptyset) + \lfloor (\operatorname{rank} A + r - 1)/r \rfloor \cdot \min\{|\hat{f}(I)| : I \in \mathcal{F}\}, \qquad (4)$$

where $A$ is a $(0, 1)$-matrix with entries $a_{ij}$ such that $a_{ij} = 1$ if and
only if term $j$ in (3) contains $x_i$. One can find an assignment of values to $x$
satisfying (4) in time $(n|\mathcal{F}|)^{O(1)}$.

Proof. By Remark 2 the function
$f(x) - \hat{f}(\emptyset) = \sum_{I \in \mathcal{F}} \hat{f}(I) \prod_{i \in I} x_i$
is the excess over the system $\prod_{i \in I} x_i = b_I$, $I \in \mathcal{F}$, where
$b_I = +1$ if $\hat{f}(I) > 0$ and $b_I = -1$ if $\hat{f}(I) < 0$, with weights
$|\hat{f}(I)|$. Clearly, Rule 1 will not change the system. Using Rule 2 we can
replace the system by an equivalent one (by Lemma 1) with $\operatorname{rank} A$
variables. By Lemma 2, the new system is irreducible and we can now apply Theorem 4.
By this theorem, Remark 2 and Remark 4,
$\max_x f(x) \ge \hat{f}(\emptyset) + k^* \min\{|\hat{f}(I)| : I \in \mathcal{F}\}$,
where $k^*$ is the maximum value of $k$ satisfying
$\operatorname{rank} A \ge (k - 1)r + 1$. It remains to observe that
$k^* = \lfloor (\operatorname{rank} A + r - 1)/r \rfloor$. □
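The final observation of the proof can be spelled out as follows; this is only a restatement of the inequality for integer $k$, with $k^*$ as defined in the proof.

```latex
\begin{align*}
\operatorname{rank} A \ge (k - 1)r + 1
&\iff k \le \frac{\operatorname{rank} A - 1}{r} + 1
      = \frac{\operatorname{rank} A + r - 1}{r}, \\
\text{so, for integer } k, \qquad
k^* &= \left\lfloor \frac{\operatorname{rank} A + r - 1}{r} \right\rfloor .
\end{align*}
```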

To give a new proof of the Edwards-Erdős bound, we need the following well-known
and easy-to-prove fact [8]. For a graph $G = (V, E)$, an incidence matrix is a
$(0, 1)$-matrix with entries $m_{e,v}$, $e \in E$, $v \in V$ such that
$m_{e,v} = 1$ if and only if $v$ is incident to $e$.

Lemma 5. The rank of an incidence matrix $M$ of a connected graph equals $|V| - 1$.
Theorem 7. Let $G = (V, E)$ be a connected graph with $n$ vertices and $m$ edges.
Then $G$ contains a bipartite subgraph with at least $\frac{m}{2} + \frac{n - 1}{4}$
edges. Such a subgraph can be found in polynomial time.

Proof. Let $V = \{v_1, v_2, \ldots, v_n\}$ and let $c : V \to \{-1, 1\}$ be a
2-coloring of $G$. Observe that the maximum number of edges in a bipartite subgraph
of $G$ equals the maximum number of properly colored edges (i.e., edges whose
end-vertices received different colors) over all 2-colorings of $G$. For an edge
$e = v_i v_j \in E$ consider the following function
$f_e(x) = \frac{1}{2}(1 - x_i x_j)$, where $x_i = c(v_i)$ and $x_j = c(v_j)$, and
observe that $f_e(x) = 1$ if $e$ is properly colored by $c$ and $f_e(x) = 0$,
otherwise. Thus, $f(x) = \sum_{e \in E} f_e(x)$ is the number of properly colored
edges for $c$. We have
$f(x) = \frac{m}{2} - \frac{1}{2} \sum_{v_i v_j \in E} x_i x_j$. By Theorem 6,
$\max_x f(x) \ge m/2 + \lfloor (\operatorname{rank} A + 2 - 1)/2 \rfloor / 2$.
Observe that matrix $A$ in this bound is an incidence matrix of $G$ and, thus, by
Lemma 5, $\operatorname{rank} A = n - 1$. Hence,
$\max_x f(x) \ge \frac{m}{2} + \frac{\lfloor n/2 \rfloor}{2} \ge \frac{m}{2} + \frac{n - 1}{4}$
as required. □
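For small graphs the bound can be compared with the true maximum cut by exhaustive search. The sketch below assumes that the rank in Theorem 6 is taken over $\mathbb{F}_2$, as in Rule 2; the function names and the edge-list representation are illustrative choices, and the exhaustive search is exponential in $n$.

```python
from itertools import product

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as int bitmasks."""
    basis = {}  # leading bit -> reduced row
    for v in rows:
        while v:
            p = v.bit_length() - 1
            if p not in basis:
                basis[p] = v
                break
            v ^= basis[p]
    return len(basis)

def edwards_erdos_check(n, edges):
    """Return (rank of incidence matrix, bound from Theorem 6, true max cut)."""
    rank = gf2_rank([(1 << u) | (1 << v) for u, v in edges])
    # Theorem 6 with r = 2 and all nonconstant Fourier coefficients equal to -1/2.
    bound = len(edges) / 2 + ((rank + 1) // 2) / 2
    best_cut = max(
        sum(1 for u, v in edges if c[u] != c[v])
        for c in product((-1, 1), repeat=n)
    )
    return rank, bound, best_cut

cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(edwards_erdos_check(5, cycle5))
```

For the 5-cycle the rank is $n - 1 = 4$, in line with Lemma 5, and the true maximum cut is at least the bound.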

This theorem can be extended to the Balanced Subgraph problem [3], where we
are given a graph $G = (V, E)$ in which each edge is labeled either by $=$ or by
$\ne$ and we are asked to find a 2-coloring of $V$ such that the maximum number of
edges is satisfied; an edge labeled by $=$ ($\ne$, resp.) is satisfied if and only if
the colors of its end-vertices are the same (different, resp.).

Theorem 8. Let $G = (V, E)$ be a connected graph with $n$ vertices and $m$ edges
labeled by either $=$ or $\ne$. There is a 2-coloring of $V$ that satisfies at least
$\frac{m}{2} + \frac{n - 1}{4}$ edges of $G$. Such a 2-coloring can be found in
polynomial time.

Proof. Let $V = \{v_1, v_2, \ldots, v_n\}$ and let $c : V \to \{-1, 1\}$ be a
2-coloring of $G$. Let $x_p = c(v_p)$, $p \in [n]$. For an edge $v_i v_j \in E$ we set
$s_{ij} = 1$ if $v_i v_j$ is labeled by $\ne$ and $s_{ij} = -1$ if $v_i v_j$ is
labeled by $=$. Then the function
$\frac{1}{2} \sum_{v_i v_j \in E} (1 - s_{ij} x_i x_j)$ counts the number of edges
satisfied by $c$. The rest of the proof is similar to that in the previous
theorem. □

6 Open Problems 

Another question of Mahajan et al. [26] remains open: what is the parameterized
complexity of deciding whether a connected graph on $n$ vertices and $m$ edges has a
bipartite subgraph with at least $m/2 + (n - 1)/4 + k$ edges, where $k$ is the
parameter? Fixed-parameter tractability of a weaker problem was proved by Bollobás
and Scott [7] a decade ago.

The kernel obtained in Theorem 3 is not of polynomial size as it is not polynomial
in $m$. The existence of a polynomial-size kernel for MaxLin2-AA[k] remains an open
problem.

Perhaps the kernel obtained in Theorem 3 or the algorithm of Corollary 1 can be 
improved if we find a structural characterization of irreducible systems for which the 
maximum excess is less than 2k. Such a characterization can be of interest by itself. 

Let $F$ be a CNF formula with clauses $C_1, \ldots, C_m$ of sizes
$r_1, \ldots, r_m$. Since the probability of $C_i$ being satisfied by a random
assignment is $1 - 2^{-r_i}$, the expected (average) number of satisfied clauses is
$E = \sum_{i=1}^{m} (1 - 2^{-r_i})$. It is natural to consider the following
parameterized problem MaxSat-AA[k]: decide whether there is a truth assignment that
satisfies at least $E + k$ clauses. When there is a constant $r$ such that
$|C_i| \le r$ for each $i = 1, \ldots, m$, MaxSat-AA[k] is denoted by
Max-r-Sat-AA[k]. Mahajan et al. [26] asked what the complexity of Max-r-Sat-AA[k] is
and Alon et al. [1] proved that it is fixed-parameter tractable. It would be
interesting to determine the complexity of MaxSat-AA[k].